Overview

Dataset Statistics

Number of Variables 12
Number of Rows 19158
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 60
Duplicate Rows (%) 0.3%
Total Size in Memory 1.8 MB
Average Row Size in Memory 96.0 B
Variable Types
  • Numerical: 10
  • Categorical: 2
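
The overview counts can be reproduced with plain Python. The sketch below is not the profiler's own implementation: `overview` is a hypothetical helper, `rows` is made-up toy data, and it assumes that every repeat of an already-seen row counts as one duplicate.

```python
from collections import Counter

def overview(rows):
    """Summarise a list of equal-length tuples (None marks a missing cell)."""
    n_rows = len(rows)
    n_vars = len(rows[0]) if rows else 0
    missing = sum(cell is None for row in rows for cell in row)
    # Assumption: each repeat of an already-seen row counts as one duplicate.
    duplicates = sum(count - 1 for count in Counter(rows).values())
    total_cells = n_rows * n_vars
    return {
        "variables": n_vars,
        "rows": n_rows,
        "missing_cells": missing,
        "missing_pct": 100 * missing / total_cells if total_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_pct": 100 * duplicates / n_rows if n_rows else 0.0,
    }

# Toy data, not taken from the profiled dataset.
rows = [(1, 0.92, 36), (0, 0.62, 47), (1, 0.92, 36), (1, None, 8)]
summary = overview(rows)
```

Applied to the report's own counts, 60 duplicate rows out of 19158 gives 100 * 60 / 19158, which rounds to the 0.3% shown above.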

Dataset Insights

gender is skewed
enrolled_university is skewed
education_level is skewed
major_discipline is skewed
experience is skewed
company_size is skewed
company_type is skewed
last_new_job is skewed
city_development_index is skewed
relevent_experience has constant length 3
target has constant length 3
gender has 1238 (6.46%) zeros
enrolled_university has 3757 (19.61%) zeros
education_level has 11598 (60.54%) zeros
company_size has 1471 (7.68%) zeros
last_new_job has 8040 (41.97%) zeros
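
The zero percentages in the insights follow directly from the zero counts and the row count. As a quick check, the sketch below recomputes them; `zero_share` is a hypothetical helper name, and the counts are copied from the insights above.

```python
N_ROWS = 19158  # "Number of Rows" from the overview

# Zero counts copied from the insights above.
zero_counts = {
    "gender": 1238,
    "enrolled_university": 3757,
    "education_level": 11598,
    "company_size": 1471,
    "last_new_job": 8040,
}

def zero_share(count, n_rows=N_ROWS):
    """Percentage of rows whose value is exactly zero, rounded to two places."""
    return round(100 * count / n_rows, 2)

shares = {name: zero_share(count) for name, count in zero_counts.items()}
```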

Variables


gender

numerical

Approximate Distinct Count 83
Approximate Unique (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 299.3 KB
Mean 0.9284
Minimum 0
Maximum 2
Zeros 1238
Zeros (%) 6.5%
Negatives 0
Negatives (%) 0.0%
  • gender is skewed left (γ1 = -2.0836)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0.9401
Median 1
Q3 1
95-th Percentile 1
Maximum 2
Range 2
IQR 0.05986

Descriptive Statistics

Mean 0.9284
Standard Deviation 0.266
Variance 0.07073
Sum 17786.5845
Skewness -2.0836
Kurtosis 9.2296
Coefficient of Variation 0.2865
  • gender is not normally distributed (p-value 2.1723614802920418e-23)
  • gender has 1429 outliers
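
Several rows in the quantile and descriptive tables are derived from others. The sketch below recomputes them for gender from the rounded values printed above, so small differences in the last digits are expected; the variable names are illustrative only.

```python
# Values copied from the gender tables above.
minimum, maximum = 0.0, 2.0
q1, q3 = 0.9401, 1.0
mean, std = 0.9284, 0.266

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
variance = std ** 2               # Variance = (Standard Deviation)^2
coeff_variation = std / mean      # Coefficient of Variation = Std / Mean
```

The recomputed values agree with the reported Range of 2, IQR of 0.05986, Variance of 0.07073 and Coefficient of Variation of 0.2865 up to the rounding of the inputs.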

relevent_experience

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1.2 MB
  • The largest value (0.0) is over 2.57 times larger than the second largest value (1.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 1.0
3rd row 1.0
4th row 1.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 38316
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • relevent_experience has words of constant length
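
The Length and Letter tables treat each value as a string. As an illustration, the sketch below classifies characters by Unicode general category and computes the ratio between the two most frequent values. It uses only the five sample rows shown above, and `char_categories` and `top_ratio` are hypothetical helper names, not the profiler's API.

```python
import unicodedata
from collections import Counter

def char_categories(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts

def top_ratio(values):
    """Ratio of the most frequent value's count to the runner-up's count."""
    (_, first), (_, second) = Counter(values).most_common(2)
    return first / second

# The five sample rows listed above, as strings.
sample = ["0.0", "1.0", "1.0", "1.0", "0.0"]
categories = char_categories(sample)  # "Nd" = Decimal Number, "Po" = Other Punctuation
ratio = top_ratio(sample)
```

Each value such as "0.0" has three characters, two of them decimal digits, which is consistent with the constant length of 3 and with the Decimal Number count of 38316 = 2 * 19158. The period falls under Other Punctuation, a category the Letter table does not list.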

enrolled_university

numerical

Approximate Distinct Count 74
Approximate Unique (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 299.3 KB
Mean 2.2714
Minimum 0
Maximum 3
Zeros 3757
Zeros (%) 19.6%
Negatives 0
Negatives (%) 0.0%
  • enrolled_university is skewed left (γ1 = -1.1731)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 1
Median 3
Q3 3
95-th Percentile 3
Maximum 3
Range 3
IQR 2

Descriptive Statistics

Mean 2.2714
Standard Deviation 1.2234
Variance 1.4967
Sum 43515.3547
Skewness -1.1731
Kurtosis -0.5059
Coefficient of Variation 0.5386
  • enrolled_university is not normally distributed (p-value 1.7522392556958625e-23)
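
The sign of γ1 indicates which side the long tail is on. The sketch below computes the moment coefficient of skewness on made-up values; it assumes the population (biased) form g1 = m3 / m2^1.5, which may differ from the estimator the profiler uses.

```python
def skewness(values):
    """Moment coefficient of skewness: g1 = m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Made-up values: most of the mass at the top code, a tail toward zero.
left_tailed = [0, 1, 3, 3, 3, 3, 3, 3]
# Mirror image of the same data: the tail now points to the right.
right_tailed = [3 - x for x in left_tailed]

g_left = skewness(left_tailed)
g_right = skewness(right_tailed)
```

Here `g_left` is negative (skewed left, as reported for enrolled_university) and `g_right` has the same magnitude with the opposite sign.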

education_level

numerical

Approximate Distinct Count 72
Approximate Unique (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 299.3 KB
Mean 0.7068
Minimum 0
Maximum 4
Zeros 11598
Zeros (%) 60.5%
Negatives 0
Negatives (%) 0.0%
  • education_level is skewed right (γ1 = 1.1496)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 2
95-th Percentile 2
Maximum 4
Range 4
IQR 2

Descriptive Statistics

Mean 0.7068
Standard Deviation 0.9902
Variance 0.9805
Sum 13540.8301
Skewness 1.1496
Kurtosis 0.4094
Coefficient of Variation 1.4009
  • education_level is not normally distributed (p-value 5.263001110827475e-22)

major_discipline

numerical

Approximate Distinct Count 285
Approximate Unique (%) 1.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 299.3 KB
Mean 4.674
Minimum 0
Maximum 5.1678
Zeros 253
Zeros (%) 1.3%
Negatives 0
Negatives (%) 0.0%
  • major_discipline is skewed left (γ1 = -3.4253)

Quantile Statistics

Minimum 0
5-th Percentile 2
Q1 5
Median 5
Q3 5
95-th Percentile 5
Maximum 5.1678
Range 5.1678
IQR 0

Descriptive Statistics

Mean 4.674
Standard Deviation 0.9457
Variance 0.8943
Sum 89544.1462
Skewness -3.4253
Kurtosis 11.1344
Coefficient of Variation 0.2023
  • major_discipline is not normally distributed (p-value 6.898240018512645e-25)
  • major_discipline has 4666 outliers

experience

numerical

Approximate Distinct Count 40
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 299.3 KB
Mean 12.9151
Minimum 0
Maximum 21
Zeros 549
Zeros (%) 2.9%
Negatives 0
Negatives (%) 0.0%
  • experience is skewed left (γ1 = -0.5112)

Quantile Statistics

Minimum 0
5-th Percentile 1
Q1 7
Median 14
Q3 18
95-th Percentile 21
Maximum 21
Range 21
IQR 11

Descriptive Statistics

Mean 12.9151
Standard Deviation 6.5906
Variance 43.4361
Sum 247427.7266
Skewness -0.5112
Kurtosis -0.9442
Coefficient of Variation 0.5103
  • experience is not normally distributed (p-value 8.669049786828174e-11)

company_size

numerical

Approximate Distinct Count 516
Approximate Unique (%) 2.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 299.3 KB
Mean 3.0564
Minimum 0
Maximum 7
Zeros 1471
Zeros (%) 7.7%
Negatives 0
Negatives (%) 0.0%
  • company_size is skewed right (γ1 = 0.3793)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 2
Median 3.0099
Q3 4
95-th Percentile 7
Maximum 7
Range 7
IQR 2

Descriptive Statistics

Mean 3.0564
Standard Deviation 1.731
Variance 2.9963
Sum 58554.8156
Skewness 0.3793
Kurtosis 0.1957
Coefficient of Variation 0.5663
  • company_size is not normally distributed (p-value 1.5987691245995733e-16)

company_type

numerical

Approximate Distinct Count 583
Approximate Unique (%) 3.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 299.3 KB
Mean 4.274
Minimum 0
Maximum 5
Zeros 603
Zeros (%) 3.1%
Negatives 0
Negatives (%) 0.0%
  • company_type is skewed left (γ1 = -2.2177)

Quantile Statistics

Minimum 0
5-th Percentile 1
Q1 4.0703
Median 5
Q3 5
95-th Percentile 5
Maximum 5
Range 5
IQR 0.9297

Descriptive Statistics

Mean 4.274
Standard Deviation 1.2587
Variance 1.5844
Sum 81881.6604
Skewness -2.2177
Kurtosis 3.9305
Coefficient of Variation 0.2945
  • company_type is not normally distributed (p-value 2.779967477819714e-22)
  • company_type has 2125 outliers

last_new_job

numerical

Approximate Distinct Count 39
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 299.3 KB
Mean 1.9212
Minimum 0
Maximum 6
Zeros 8040
Zeros (%) 42.0%
Negatives 0
Negatives (%) 0.0%
  • last_new_job is skewed right (γ1 = 0.7258)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 1
Q3 4
95-th Percentile 6
Maximum 6
Range 6
IQR 4

Descriptive Statistics

Mean 1.9212
Standard Deviation 2.1493
Variance 4.6195
Sum 36806.2371
Skewness 0.7258
Kurtosis -0.9129
Coefficient of Variation 1.1187
  • last_new_job is not normally distributed (p-value 1.4472874713434029e-18)

city_development_index

numerical

Approximate Distinct Count 93
Approximate Unique (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 299.3 KB
Mean 0.8288
Minimum 0.448
Maximum 0.949
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • city_development_index is skewed left (γ1 = -0.9953)

Quantile Statistics

Minimum 0.448
5-th Percentile 0.624
Q1 0.74
Median 0.903
Q3 0.92
95-th Percentile 0.926
Maximum 0.949
Range 0.501
IQR 0.18

Descriptive Statistics

Mean 0.8288
Standard Deviation 0.1234
Variance 0.01522
Sum 15879.07
Skewness -0.9953
Kurtosis -0.5387
Coefficient of Variation 0.1488
  • city_development_index is not normally distributed (p-value 9.917514244939198e-21)
  • city_development_index has 17 outliers

training_hours

numerical

Approximate Distinct Count 241
Approximate Unique (%) 1.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 299.3 KB
Mean 65.3669
Minimum 1
Maximum 336
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • training_hours is skewed right (γ1 = 1.8191)

Quantile Statistics

Minimum 1
5-th Percentile 7
Q1 23
Median 47
Q3 88
95-th Percentile 188
Maximum 336
Range 335
IQR 65

Descriptive Statistics

Mean 65.3669
Standard Deviation 60.0585
Variance 3607.0188
Sum 1.2523e+06
Skewness 1.8191
Kurtosis 3.8392
Coefficient of Variation 0.9188
  • training_hours is not normally distributed (p-value 0.0005325410999039795)
  • training_hours has 984 outliers
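
The report does not state how it counts outliers. A common convention, assumed here rather than confirmed by the source, is Tukey's rule: flag values more than 1.5 * IQR outside the quartiles. The sketch below applies it to the training_hours quartiles; `tukey_fences` is a hypothetical helper name.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) bounds; values outside them are flagged."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles copied from the training_hours table above.
lower, upper = tukey_fences(q1=23, q3=88)
```

Under this assumption the fences are -74.5 and 185.5. The minimum of 1 lies inside the lower fence, while the 95-th percentile of 188 lies just above the upper one, so slightly more than 5% of the 19158 rows would be flagged, which is in line with the 984 outliers reported.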

target

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1.2 MB
  • The largest value (0.0) is over 3.01 times larger than the second largest value (1.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 1.0
2nd row 0.0
3rd row 0.0
4th row 1.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 38316
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • target has words of constant length

Interactions

Correlations

Missing Values